Vive la Petite Différence! - Exploiting Small Differences for Gender Attribution of Short Texts

Authors

  • Filip Gralinski
  • Rafal Jaworski
  • Lukasz Borchmann
  • Piotr Wierzchon
Abstract

This article describes a series of experiments on gender attribution of Polish texts. The research was conducted on the publicly available corpus “He Said She Said”, which consists of a large number of short texts from the Polish version of Common Crawl. Unlike other experiments on gender attribution, this research takes on the task of classifying relatively short texts authored by many different people. For this work, the original “He Said She Said” corpus was filtered to eliminate noise and apparent errors in the training data. Next, various machine learning algorithms were developed to improve classification accuracy. The results of the experiments presented in this paper are fully reproducible, as all the source code was deposited on the open platform Gonito.net. Gonito.net allows machine learning tasks to be defined and tackled by multiple researchers, and gives the researchers easy access to each other’s results.

Similar resources

Political Marketing – Vive La Différence!

Introduction The explicit use of techniques in politics which we would now describe as marketing dates back at least to 1920 in Britain (Wring, 1994). Since the Saatchi and Saatchi poster – “Labour isn’t working” – it has become commonplace to speak of political marketing, and many marketers have come to believe that there is a direct transference of their concepts and tools to the political ar...

Vive la Différence: Paxos vs. Viewstamped Replication vs. Zab

Paxos, Viewstamped Replication, and Zab are replication protocols that ensure high-availability in asynchronous environments with crash failures. Various claims have been made about similarities and differences between these protocols. But how does one determine whether two protocols are the same, and if not, how significant the differences are? We propose to address these questions using refin...

Vive la différence II. The Ax-Kochen isomorphism theorem

We show in §1 that the Ax-Kochen isomorphism theorem [AK] requires the continuum hypothesis. Most of the applications of this theorem are insensitive to set theoretic considerations. (A probable exception is the work of Moloney [Mo].) In §2 we give an unrelated result on cuts in models of Peano arithmetic which answers a question on the ideal structure of countable ultraproducts of Z posed in [...

Plant Asymmetric Cell Division, Vive la Différence!

Although little is known about how asymmetric cell division in plants is regulated, recent discoveries provide a starting point for exploring the mechanisms underlying this process. These studies reveal parallels with asymmetric division in yeast and animals, but also point to regulated cell expansion as a new mechanism of asymmetric division in plants.

Publication date: 2016